Longitudinal data analysis

During this computer lab we will

  1. Try to reproduce the results of the lectures.
  2. Perform the analysis from section 2.4 of John Fox’s article on Linear Mixed Models.
  3. Analyze data from a trial on recovery after stroke.

Before we get started, we’ll first load a few packages we will need for our analysis.

library(foreign)
library(gdata)
library(nlme)
library(psych)
library(ggplot2)
library(car)

If you get an error message that a package is not available, you will first have to install it (you only need to do this once). For instance, for the foreign package: install.packages("foreign"), or use the menu in RStudio (Tools - Install Packages).
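The check-and-install step can also be scripted; a minimal sketch that installs only whatever is missing (run once per machine):

```r
# Install any of the required packages that are not yet present,
# then load them all.
pkgs <- c("foreign", "gdata", "nlme", "psych", "ggplot2", "car")
missing <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing) > 0) install.packages(missing)
invisible(lapply(pkgs, library, character.only = TRUE))
```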

Exercise 1

Repeat the linear mixed models analyses of the Reisby dataset, using time as a continuous variable. There are two versions of the dataset: “wide format” (reisbywide.Rdata), meaning that all observations for a patient are in a single row, and “long format” (reisbylong.Rdata), with observations from different time points on separate lines (so 6 lines per patient). Some of the descriptive analyses are easier to do when the data is in “wide format”, and others when the data is in “long format”. The mixed models need to be run on the data in “long” format.

  a. Do some initial data analysis: get descriptive statistics and make plots of the data. (Note that most of the descriptive statistics – means, SDs, correlations – are easier to get from the wide version of the data, while the spaghetti plots and individual plots are easier to get from the long version.) Also, after loading the data files, we’ll change id and week to factor variables, and keep ‘time’ (a copy of week) as continuous, for later use:
load(file = "reisbywide.Rdata")
load(file = "reisbylong.Rdata")
reisby.long$id <- factor(reisby.long$id)
reisby.long$time <- reisby.long$week
reisby.long$week <- factor(reisby.long$week)
head(reisby.wide)
##    id endo hdrs.0 hdrs.1 hdrs.2 hdrs.3 hdrs.4 hdrs.5
## 1 101    0     26     22     18      7      4      3
## 2 103    0     33     24     15     24     15     13
## 3 104    1     29     22     18     13     19      0
## 4 105    0     22     12     16     16     13      9
## 5 106    1     21     25     23     18     20     NA
## 6 107    1     21     21     16     19     NA      6
head(reisby.long, n=10)
##     id hdrs week endo time
## 1  101   26    0    0    0
## 2  101   22    1    0    1
## 3  101   18    2    0    2
## 4  101    7    3    0    3
## 5  101    4    4    0    4
## 6  101    3    5    0    5
## 7  103   33    0    0    0
## 8  103   24    1    0    1
## 9  103   15    2    0    2
## 10 103   24    3    0    3
summary(reisby.wide)
##        id             endo            hdrs.0          hdrs.1     
##  Min.   :101.0   Min.   :0.0000   Min.   :15.00   Min.   :11.00  
##  1st Qu.:303.2   1st Qu.:0.0000   1st Qu.:21.00   1st Qu.:19.00  
##  Median :332.0   Median :1.0000   Median :22.00   Median :22.00  
##  Mean   :324.0   Mean   :0.5606   Mean   :23.44   Mean   :21.84  
##  3rd Qu.:353.8   3rd Qu.:1.0000   3rd Qu.:27.00   3rd Qu.:25.00  
##  Max.   :610.0   Max.   :1.0000   Max.   :34.00   Max.   :39.00  
##                                   NA's   :5       NA's   :3      
##      hdrs.2          hdrs.3          hdrs.4          hdrs.5     
##  Min.   :10.00   Min.   : 0.00   Min.   : 0.00   Min.   : 0.00  
##  1st Qu.:14.00   1st Qu.:12.00   1st Qu.: 9.00   1st Qu.: 6.25  
##  Median :18.00   Median :16.00   Median :12.00   Median :11.00  
##  Mean   :18.31   Mean   :16.42   Mean   :13.62   Mean   :11.95  
##  3rd Qu.:22.00   3rd Qu.:21.00   3rd Qu.:19.00   3rd Qu.:16.75  
##  Max.   :33.00   Max.   :32.00   Max.   :34.00   Max.   :33.00  
##  NA's   :1       NA's   :1       NA's   :3       NA's   :8

Now we’re ready to get some descriptive statistics from the wide form of the dataset. Here I use the describeBy() function (from the psych package). Note that skew=FALSE is used only to clean up some of the (otherwise excessive) output:

describeBy(x=reisby.wide[,3:8], group=reisby.wide$endo, skew=FALSE)
## 
##  Descriptive statistics by group 
## group: 0
##        vars  n  mean   sd min max range   se
## hdrs.0    1 28 22.79 4.12  15  33    18 0.78
## hdrs.1    2 29 20.48 3.83  11  27    16 0.71
## hdrs.2    3 28 17.00 4.35  10  28    18 0.82
## hdrs.3    4 29 15.34 6.17   0  26    26 1.15
## hdrs.4    5 29 12.62 6.72   0  28    28 1.25
## hdrs.5    6 27 11.22 6.34   1  29    28 1.22
## ------------------------------------------------------------ 
## group: 1
##        vars  n  mean   sd min max range   se
## hdrs.0    1 33 24.00 4.85  17  34    17 0.84
## hdrs.1    2 34 23.00 5.10  15  39    24 0.87
## hdrs.2    3 37 19.30 6.08  10  33    23 1.00
## hdrs.3    4 36 17.28 6.56   7  32    25 1.09
## hdrs.4    5 34 14.47 7.17   2  34    32 1.23
## hdrs.5    6 31 12.58 7.96   0  33    33 1.43

Let’s look at the correlations of HDRS scores over the six time points:

round(cor(reisby.wide[,3:8],use="pairwise.complete.obs"),digits=3)
##        hdrs.0 hdrs.1 hdrs.2 hdrs.3 hdrs.4 hdrs.5
## hdrs.0  1.000  0.493  0.410  0.333  0.227  0.184
## hdrs.1  0.493  1.000  0.494  0.412  0.308  0.218
## hdrs.2  0.410  0.494  1.000  0.738  0.669  0.461
## hdrs.3  0.333  0.412  0.738  1.000  0.817  0.568
## hdrs.4  0.227  0.308  0.669  0.817  1.000  0.654
## hdrs.5  0.184  0.218  0.461  0.568  0.654  1.000

We’ll use ggplot2 for the spaghetti plot (now we need the long form of the dataset). We start by defining the base for the graphs and storing it in object p. To that, we add geom_line(), which makes a simple spaghetti plot:

p <- ggplot(data = reisby.long, aes(x = week, y = hdrs, group = id))
p + geom_line()
## Warning: Removed 13 row(s) containing missing values (geom_path).

To this graph we add a facet based on the grouping variable (endogenous/exogenous):

p + geom_line() + facet_grid(. ~ endo)
## Warning: Removed 13 row(s) containing missing values (geom_path).

We can make the graph a little fancier by adding the mean HDRS per timepoint:

p + geom_line() + stat_summary(aes(group = 1), geom = "point", fun.y = mean, shape = 17, size = 3) + facet_grid(. ~ endo)
## Warning: `fun.y` is deprecated. Use `fun` instead.
## Warning: Removed 21 rows containing non-finite values (stat_summary).
## Warning: Removed 13 row(s) containing missing values (geom_path).

Finally, we can do a quick check of the functional form of time: add loess curves through the observed data, with confidence bands. If a straight line fits within these bands, it’s not unreasonable to fit a straight line for time:

p <- ggplot(data = reisby.long, aes(x = week, y = hdrs, group = id))
p + geom_line() + stat_smooth(aes(group = 1)) + stat_summary(aes(group = 1), geom = "point", fun.y = mean, 
    shape = 17, size = 3) + facet_grid(. ~ endo)
## Warning: `fun.y` is deprecated. Use `fun` instead.
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 21 rows containing non-finite values (stat_smooth).
## Warning: Removed 21 rows containing non-finite values (stat_summary).
## Warning: Removed 13 row(s) containing missing values (geom_path).

Looking at this graph, it seems fairly reasonable to model time as continuous and linear. Of course, you could have just run this last command instead of running all the graphs in between; the intermediate steps were meant to make clearer what the ggplot2 options are doing.

We can also look at plots of individuals using ggplot2, both with lines connecting the observations and with fitted least-squares lines per subject. The second version of the graph, with fitted lines, is presented below:

p + geom_point() + geom_line() + facet_wrap(~ id)
## Warning: Removed 21 rows containing missing values (geom_point).
## Warning: Removed 13 row(s) containing missing values (geom_path).

p <- ggplot(data = reisby.long, aes(x = week, y = hdrs, group = id))
p + geom_point() + facet_wrap(~ id) + stat_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'
## Warning: Removed 21 rows containing non-finite values (stat_smooth).
## Warning: Removed 21 rows containing missing values (geom_point).

Looking at the individual plots, it may or may not be reasonable to model each individual’s trajectory over time as linear. We’ll look into that more later; for now we’ll assume linear time trends per person.

  b. Can you think of a few possible hypotheses about the effect of endo?

There is a main effect of endo (e.g. people with endogenous depression generally have higher/lower HDRS score than those with exogenous depression). Test H0: endo=0.

There is an interaction between endo and time (e.g. people with endogenous depression have a different time trend for HDRS score than people with exogenous depression). Test H0: time:endo=0.
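These hypotheses can be examined once a model containing both terms has been fitted (as is done below); a sketch using the Anova() function from the already-loaded car package:

```r
# Wald-type (Type II) tests for the fixed effects of the random-slope
# model lme.ris, which is fitted later in this exercise
Anova(lme.ris)
```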

  c. Repeat the mixed model analyses of the Reisby dataset: model depression score (HDRS) as a function of time (linear), endo/exo and the interaction of the two. Use a model with only a random intercept per patient, and a model with a random intercept plus a random slope for time.

We’ll start with a mixed model with fixed effects for time (for now, we’re modelling time as continuous and linear), endo and the interaction of time*endo, and just a random intercept per person:

lme.ril<-lme(fixed=hdrs ~ time*endo, random=~1|id, data=reisby.long, na.action="na.omit", method="ML")
summary(lme.ril)
## Linear mixed-effects model fit by maximum likelihood
##  Data: reisby.long 
##        AIC      BIC    logLik
##   2294.137 2317.699 -1141.069
## 
## Random effects:
##  Formula: ~1 | id
##         (Intercept) Residual
## StdDev:    3.909812 4.362878
## 
## Fixed effects: hdrs ~ time * endo 
##                 Value Std.Error  DF    t-value p-value
## (Intercept) 22.441628 0.9441419 307  23.769337  0.0000
## time        -2.351842 0.1990804 307 -11.813530  0.0000
## endo         1.992873 1.2702442  64   1.568890  0.1216
## time:endo   -0.044176 0.2720416 307  -0.162387  0.8711
##  Correlation: 
##           (Intr) time   endo  
## time      -0.524              
## endo      -0.743  0.390       
## time:endo  0.384 -0.732 -0.531
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -3.18214138 -0.59376247 -0.03548597  0.53988343  3.48261468 
## 
## Number of Observations: 375
## Number of Groups: 66

Take a good look at the output, and see if you can interpret it. Note that we are using all available data from all 66 subjects (see the last two lines of the output).
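As a quick check of how na.omit handled the missing HDRS scores, we can count the non-missing observations per subject; a small sketch:

```r
# Number of non-missing HDRS measurements per patient: most patients
# contribute 6, some fewer (375 observations over 66 patients in total)
obs.per.id <- table(reisby.long$id[!is.na(reisby.long$hdrs)])
length(obs.per.id)   # number of subjects
sum(obs.per.id)      # total number of observations used
```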

Purely optional: you can plot predictions from this model, to see what we predict for everyone on the basis of this model. To do this, we use the variables id, endo and time from the long version of the file as the prediction data frame (newd). Then we ask R for predictions on the basis of our model for the patients, time points, and level of endo/exo in our prediction data frame. Finally, we graph those predictions against time.

newd <- reisby.long[,c("id", "endo", "time")]
newd$pred1 <- predict(lme.ril, newdata=newd)
pp1 <- ggplot(data = newd, aes(x = time, y = pred1, group = id))
pp1 + geom_line()

The predicted lines for all 66 patients appear to be exactly parallel. This is because the time*endo interaction is nearly 0, so the difference in slope between the people with endogenous and exogenous depression is hardly visible to the naked eye. To get a sense of how the interaction is working, try:

ggplot(data = newd, aes(x = time, y = pred1, group = id, color=factor(endo))) + geom_line()

With the two different colors, you might be able to see that the lines predicted by the model for the two groups aren’t exactly parallel.

Now we will fit a mixed model with fixed effects for time, endo, and their interaction, plus a random intercept & random slope for time (continuous/linear) per subject:

lme.ris<-update(lme.ril, random=~time|id)
summary(lme.ris)
## Linear mixed-effects model fit by maximum likelihood
##  Data: reisby.long 
##        AIC      BIC    logLik
##   2230.929 2262.345 -1107.465
## 
## Random effects:
##  Formula: ~time | id
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev   Corr  
## (Intercept) 3.411897 (Intr)
## time        1.441191 -0.285
## Residual    3.495500       
## 
## Fixed effects: hdrs ~ time * endo 
##                 Value Std.Error  DF   t-value p-value
## (Intercept) 22.476263 0.7986137 307 28.144100  0.0000
## time        -2.365687 0.3134843 307 -7.546430  0.0000
## endo         1.988021 1.0747917  64  1.849680  0.0690
## time:endo   -0.027056 0.4217255 307 -0.064155  0.9489
##  Correlation: 
##           (Intr) time   endo  
## time      -0.451              
## endo      -0.743  0.335       
## time:endo  0.335 -0.743 -0.457
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -2.74478736 -0.49680979  0.03394173  0.49640589  3.62084094 
## 
## Number of Observations: 375
## Number of Groups: 66
intervals(lme.ris)
## Approximate 95% confidence intervals
## 
##  Fixed effects:
##                  lower        est.      upper
## (Intercept) 20.9132175 22.47626306 24.0393087
## time        -2.9792384 -2.36568735 -1.7521363
## endo        -0.1476402  1.98802094  4.1236821
## time:endo   -0.8524565 -0.02705587  0.7983448
## attr(,"label")
## [1] "Fixed effects:"
## 
##  Random Effects:
##   Level: id 
##                            lower       est.      upper
## sd((Intercept))        2.5719245  3.4118965 4.52619738
## sd(time)               1.1299064  1.4411915 1.83823447
## cor((Intercept),time) -0.5670835 -0.2850265 0.05686282
## 
##  Within-group standard error:
##    lower     est.    upper 
## 3.195385 3.495500 3.823801

Again, purely optional: plot the predictions from the model with a random intercept & random slope per subject (using same prediction data frame from above).

newd$pred2 <- predict(lme.ris, newdata=newd)
pp2 <- ggplot(data = newd, aes(x = time, y = pred2, group = id))
pp2 + geom_line()

  d. Which model from (c) do you think fits the data better, and why?

The model with a random slope for time per patient fits better: its AIC is considerably lower (2230.9 versus 2294.1 for the random-intercept-only model), and its predictions correspond better to the observed data.
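The two models can also be compared formally with a likelihood-ratio test; since both were fitted with method="ML", anova() is appropriate here. (Because the null hypothesis puts the random-slope variance on the boundary of its parameter space, the reported p-value is somewhat conservative.)

```r
# LRT: random intercept only (lme.ril) vs. random intercept + slope (lme.ris)
anova(lme.ril, lme.ris)
```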

  e. Interpret the second model from (c).

Interpretation of the model:

  - The intercept (22.48) is the average HDRS score when all variables = 0, so for patients with exogenous depression (the reference group) at time = 0.
    → patients with exogenous depression start with an average of 22.48.
  - The estimate for endo (1.99) is the average difference in HDRS between endogenous and exogenous patients at time = 0.
    → patients with endogenous depression start with an average of 22.476 + 1.988 = 24.46.
  - The estimated random intercept s.d. of 3.41 indicates considerable fluctuation around the fixed intercepts: patients can start quite a bit higher or lower than average.
  - The “average” slope is -2.37 for patients with exogenous depression.
    → per time unit (1 week) the HDRS score decreases on average by 2.37 for patients with exogenous depression.
  - The interaction endo:time (-0.027) is the difference in slope between endogenous and exogenous patients.
    → per time unit (1 week) the HDRS score decreases on average by 2.39 for patients with endogenous depression.
  - The estimated random slope s.d. is 1.44.
    → for individuals the slope can be quite a bit steeper or flatter, and may even be positive for some patients (as seen in the plot).
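The group-specific intercepts and slopes quoted above can be computed directly from the fixed effects of the fitted model; a small sketch:

```r
# Implied fixed-effects intercept and slope per group, from lme.ris
fe <- fixef(lme.ris)
intercepts <- c(exo = unname(fe["(Intercept)"]),
                endo = unname(fe["(Intercept)"] + fe["endo"]))
slopes <- c(exo = unname(fe["time"]),
            endo = unname(fe["time"] + fe["time:endo"]))
intercepts  # approximately 22.48 and 24.46
slopes      # approximately -2.37 and -2.39
```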
  f. Save your script/syntax.

Exercise 2

On page 25 of Mixed-Effects Models in R (Appendix to An R Companion to Applied Regression, Second Edition) by John Fox and Sanford Weisberg, you will find section 2.4, “An Illustrative Application to Longitudinal Data”.

In this exercise you will try to reproduce the results presented there. (Note that you can copy all commands from the article and paste them into R or RStudio.) Concentrate only on the first 3 models and their interpretations. The anova() commands, comparing the models, may be skipped over, as may be the table on page 32 (starting at line 6). Do try out the compareCoefs() function around the middle of page 32!

Whether you choose to skip the anova() commands for now or not, please add method="ML" to the first lme() command (since the rest of the models are “updated” from the first model, they will all be fit using ML estimation). We’ll come back to this in mixed models part 3. Note that your output will differ a bit from Fox and Weisberg’s.

Note that you can load the data by means of:

data(Blackmore)
  a. Examine the time variable (age). What is different about this time variable, compared to, say, time in the Reisby data?
data(Blackmore)
head(Blackmore, 20)
##    subject   age exercise   group
## 1      100  8.00     2.71 patient
## 2      100 10.00     1.94 patient
## 3      100 12.00     2.36 patient
## 4      100 14.00     1.54 patient
## 5      100 15.92     8.63 patient
## 6      101  8.00     0.14 patient
## 7      101 10.00     0.14 patient
## 8      101 12.00     0.00 patient
## 9      101 14.00     0.00 patient
## 10     101 16.67     5.08 patient
## 11     102  8.00     0.92 patient
## 12     102 10.00     1.82 patient
## 13     102 12.00     4.75 patient
## 14     102 15.08    24.72 patient
## 15     103  8.00     1.04 patient
## 16     103 10.00     2.90 patient
## 17     103 12.00     2.65 patient
## 18     103 14.08     6.86 patient
## 19     104  8.00     2.75 patient
## 20     104 10.00     6.62 patient
Blackmore$log.exercise <- log2(Blackmore$exercise + 5/60)
pat <- with(Blackmore, sample(unique(subject[group == "patient"]), 20))
Pat.20 <- groupedData(log.exercise ~ age | subject,
          data=Blackmore[is.element(Blackmore$subject, pat),])
con <- with(Blackmore, sample(unique(subject[group == "control"]), 20))
Con.20 <- groupedData(log.exercise ~ age | subject,
          data=Blackmore[is.element(Blackmore$subject, con),])

# Fox's code for the individual plots of 20 control subjects & 20 subjects with anorexia
print(plot(Con.20, main="Control Subjects",
             xlab="Age", ylab="log2 Exercise",
             ylim=1.2*range(Con.20$log.exercise, Pat.20$log.exercise),
             layout=c(5, 4), aspect=1.0),
             position=c(0, 0, 0.5, 1), more=TRUE)
print(plot(Pat.20, main="Patients",
             xlab="Age", ylab="log2 Exercise",
             ylim=1.2*range(Con.20$log.exercise, Pat.20$log.exercise),
             layout=c(5, 4), aspect=1.0),
             position=c(0.5, 0, 1, 1))

# (You could also make these individual plots using ggplot2, only then it's hard to get them side-by-side)
pc <- ggplot(data = Con.20, aes(x = age, y = log.exercise)) + ggtitle("Control Subjects") + ylim(1.2*range(Con.20$log.exercise, Pat.20$log.exercise))
pc + geom_point() + geom_line() + facet_wrap(~subject)

pp <- ggplot(data = Pat.20, aes(x = age, y = log.exercise)) + ggtitle("Patients") + ylim(1.2*range(Con.20$log.exercise, Pat.20$log.exercise))
pp + geom_point() + geom_line() + facet_wrap(~subject)

# it could be done using the ggarrange function from the ggpubr package
ggpubr::ggarrange(pc + geom_point() + geom_line() + facet_wrap(~subject),
                  pp + geom_point() + geom_line() + facet_wrap(~subject))

The time variable (age) is measured more continuously than in the Reisby dataset. Though most girls are measured at ages 8, 10, 12, 14 and 16, this is not always the case. Some girls are measured at ages between these “categories” of age.
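The claim above can be checked by tabulating the observed ages; a quick sketch:

```r
# How many measurements fall at each (rounded) age? The bulk sit at the
# scheduled ages, but the final measurement in particular varies per girl.
table(round(Blackmore$age))
range(Blackmore$age)
```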

  b. Why is age - 8 used in the models?

All girls are measured starting at the age of 8, so age = 8 becomes the “zero” point for time.

  c. Interpret the coefficients of the 3rd model (bm.lme.3).

# Mixed models
# 1. Fixed: age, group & interaction; Random: intercept, age
bm.lme.1 <- lme(log.exercise ~ I(age - 8)*group,
            random = ~ I(age - 8) | subject, data=Blackmore, method="ML")
summary(bm.lme.1)
## Linear mixed-effects model fit by maximum likelihood
##  Data: Blackmore 
##       AIC     BIC    logLik
##   3615.31 3654.12 -1799.655
## 
## Random effects:
##  Formula: ~I(age - 8) | subject
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 1.4344611 (Intr)
## I(age - 8)  0.1626179 -0.278
## Residual    1.2440829       
## 
## Fixed effects: log.exercise ~ I(age - 8) * group 
##                              Value  Std.Error  DF   t-value p-value
## (Intercept)             -0.2761593 0.18197173 712 -1.517594  0.1296
## I(age - 8)               0.0641167 0.03128611 712  2.049367  0.0408
## grouppatient            -0.3536492 0.23477497 229 -1.506333  0.1334
## I(age - 8):grouppatient  0.2396503 0.03930757 712  6.096798  0.0000
##  Correlation: 
##                         (Intr) I(g-8) grpptn
## I(age - 8)              -0.489              
## grouppatient            -0.775  0.379       
## I(age - 8):grouppatient  0.389 -0.796 -0.489
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -2.7391759 -0.4321273  0.1230948  0.5309238  2.6410291 
## 
## Number of Observations: 945
## Number of Groups: 231
# 2. Fixed: age, group & interaction; Random: intercept only
bm.lme.2 <- update(bm.lme.1, random = ~ 1 | subject)
anova(bm.lme.1, bm.lme.2) # test for random slopes
##          Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## bm.lme.1     1  8 3615.310 3654.120 -1799.655                        
## bm.lme.2     2  6 3628.822 3657.929 -1808.411 1 vs 2 17.51181   2e-04
# 3. Fixed: age, group & interaction; Random: age only
bm.lme.3 <- update(bm.lme.1, random = ~ I(age - 8) - 1 | subject)
anova(bm.lme.1, bm.lme.3) # test for random intercepts
##          Model df      AIC      BIC    logLik   Test L.Ratio p-value
## bm.lme.1     1  8 3615.310 3654.120 -1799.655                       
## bm.lme.3     2  6 3818.907 3848.015 -1903.454 1 vs 2 207.597  <.0001

Patients and controls did not differ in time spent exercising at age 8 (p = 0.1327). There was a small, statistically significant increase in exercise for the control group (p = 0.0384), and a considerably larger increase among patients. The trends for the two groups differed significantly (p < 0.00005). Figure 8 of Fox’s article would be a very helpful addition to a manuscript that tried to explain the differences between the two groups.
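Note that these p-values do not match the printed bm.lme.1 table; presumably they come from the coefficient table of bm.lme.3, which was not printed above but can be displayed with:

```r
# Fixed-effects table of the model with a random slope for age only
summary(bm.lme.3)
```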

# figure 8
# fit regressions of log.exercise on age for each subject
pat.list <- lmList(log.exercise ~ I(age - 8) | subject,
                   subset = group=="patient", data=Blackmore)
con.list <- lmList(log.exercise ~ I(age - 8) | subject,
                   subset = group=="control", data=Blackmore)
pat.coef <- coef(pat.list)
con.coef <- coef(con.list)
old <- par(mfrow=c(1, 2))
boxplot(pat.coef[,1], con.coef[,1], main="Intercepts",
          names=c("Patients", "Controls"))
boxplot(pat.coef[,2], con.coef[,2], main="Slopes",
          names=c("Patients", "Controls"))

par(old)
  d. Make a plot like Figure 9 based on model bm.lme.3. You may use the code from the article, or come up with your own ggplot2 solution.
# calculate fitted values for patients and controls across the range of ages
# We first make a new prediction dataset, with ages running from 8 to 18 for both groups
# Then we use the predict command, with level=0 for fixed effects
pdata <- expand.grid(age=seq(8, 18, by=2), group=c("patient", "control"))
pdata$log.exercise <- predict(object=bm.lme.3, newdata=pdata, level=0)
pdata$exercise <- 2^pdata$log.exercise - 5/60
pdata
##    age   group log.exercise  exercise
## 1    8 patient  -0.63962586 0.5585461
## 2   10 patient  -0.02877357 0.8969199
## 3   12 patient   0.58207873 1.4136713
## 4   14 patient   1.19293103 2.2028340
## 5   16 patient   1.80378333 3.4080126
## 6   18 patient   2.41463563 5.2485146
## 7    8 control  -0.27778534 0.7415229
## 8   10 control  -0.14955184 0.8181971
## 9   12 control  -0.02131835 0.9019986
## 10  14 control   0.10691515 0.9935897
## 11  16 control   0.23514865 1.0936947
## 12  18 control   0.36338215 1.2031049
# Plot the fitted values "by hand" (Fig 9)
plot(pdata$age, pdata$exercise, type="n",
      xlab="Age (years)", ylab="Exercise (hours/week)")
points(pdata$age[1:6], pdata$exercise[1:6], type="b", pch=19, lwd=2)
points(pdata$age[7:12], pdata$exercise[7:12], type="b", pch=22, lty=2, lwd=2)
legend("topleft", c("Patients", "Controls"), pch=c(19, 22),
       lty=c(1,2), lwd=2, inset=0.05)

  e. Write a “statistical methods” section in which you describe, in a few sentences, how the results for the 3rd model (bm.lme.3) were obtained. (For now: do not worry about explaining how you chose model 3.) Be as concise – yet complete – as possible.

Answers may vary. My version: “Time spent exercising was log (base 2) transformed due to right skewness and heterogeneity of variances. A linear mixed effects model was used to account for multiple measurements per subject. Fixed effects were included for age (centered at age = 8), group and an age*group interaction; a random slope for age per girl was added. For interpretation, predicted values were transformed back to the original scale and plotted.”

Exercise 3

The data contained in the file stroke.Rdata are from an experiment to promote the recovery of stroke patients. There were three experimental groups:

  A. a new occupational therapy intervention;
  B. the existing stroke rehabilitation program, conducted in the same hospital as A;
  C. the usual care regime for stroke patients, provided in a different hospital.

There were 8 patients in each experimental group. The response variable was a measure of functional ability, the Barthel index: higher scores correspond to better outcomes and the maximum score is 100. Each program lasted for 8 weeks. All subjects were evaluated at the start of the program and at weekly intervals until the end of the program. The hypothesis was that the patients in group A would do better than those in group B or C.

  a. Thinking about the design of the study (and without yet looking at the data), what approach(es) would you use to model this data? Think about both the fixed part of the model (to answer the research question) and the random part of the model (to account for correlated measurements).

The research question is a bit vague. If we simply want to know whether patients score higher on one treatment than on another, we can use just main effects for time and treatment. If the experiment is randomized and there is a suspicion that everyone begins at about the same point, but treatment A improves more quickly, then we are interested in a tx*time interaction. One dilemma is how to model time: if the course of the Barthel index over time is linear, an LME (random intercept and random effect for time) would be a good option; if it is not, a covariance pattern model (CPM) might be a better choice.

Measurements are at equally spaced intervals, and measurements closer together are likely to be more correlated than measurements further apart, so a CPM with a heterogeneous AR(1) structure is a good option. A CPM with compound symmetry (or an LME model with only a random intercept) is an unlikely candidate, and a CPM with an unstructured correlation matrix is a terrible choice, because it would require 8 + 28 = 36 parameters!
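As a sketch of what such a CPM could look like in nlme (assuming the data have been reshaped to long format, as done later in this exercise, with columns Subject, Group, week and Barthel):

```r
# Covariance pattern model: heterogeneous AR(1) within subject, fitted
# with gls() from nlme. corAR1 gives the serial correlation; varIdent
# allows a different residual variance per week.
cpm.ar1 <- gls(Barthel ~ Group * week,
               correlation = corAR1(form = ~ week | Subject),
               weights = varIdent(form = ~ 1 | week),
               data = stroke.long, method = "ML")
summary(cpm.ar1)
```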

  b. How would you treat the first Barthel index evaluation?

It is not clear whether the first Barthel index measurement is pre-randomization (in fact, it is not even clear whether this was a randomized experiment), so it might be better to use the first Barthel measurement as an outcome rather than a covariate. On the other hand, the description says “at the start of the program”, and you may think that sounds like a measurement that cannot yet be affected by the program. In that case, you could choose to control for “baseline” Barthel index.
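If you do choose to control for baseline, one way to construct the analysis dataset is to peel off the week-1 score and merge it back in as a covariate; a sketch (assuming the long-format data stroke.long created later in this exercise):

```r
# Week-1 Barthel score becomes a baseline covariate;
# weeks 2-8 remain as outcomes
base <- subset(stroke.long, week == 1, select = c(Subject, Barthel))
names(base)[2] <- "baseline"
stroke.post <- merge(subset(stroke.long, week > 1), base, by = "Subject")
head(stroke.post)
```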

  c. Get descriptive statistics of the measurements and examine correlations of measurements over time.
######################Analysis of stroke data ########################
load(file = "stroke.Rdata")
stroke
##    Subject Group week1 week2 week3 week4 week5 week6 week7 week8
## 1        1     A    45    45    45    45    80    80    80    90
## 2        2     A    20    25    25    25    30    35    30    50
## 3        3     A    50    50    55    70    70    75    90    90
## 4        4     A    25    25    35    40    60    60    70    80
## 5        5     A   100   100   100   100   100   100   100   100
## 6        6     A    20    20    30    50    50    60    85    95
## 7        7     A    30    35    35    40    50    60    75    85
## 8        8     A    30    35    45    50    55    65    65    70
## 9        9     B    40    55    60    70    80    85    90    90
## 10      10     B    65    65    70    70    80    80    80    80
## 11      11     B    30    30    40    45    65    85    85    85
## 12      12     B    25    35    35    35    40    45    45    45
## 13      13     B    45    45    80    80    80    80    80    80
## 14      14     B    15    15    10    10    10    20    20    20
## 15      15     B    35    35    35    45    45    45    50    50
## 16      16     B    40    40    40    55    55    55    60    65
## 17      17     C    20    20    30    30    30    30    30    30
## 18      18     C    35    35    35    40    40    40    40    40
## 19      19     C    35    35    35    40    40    40    45    45
## 20      20     C    45    65    65    65    80    85    95   100
## 21      21     C    45    65    70    90    90    95    95   100
## 22      22     C    25    30    30    35    40    40    40    40
## 23      23     C    25    25    30    30    30    30    35    40
## 24      24     C    15    35    35    35    40    50    65    65
# descriptive statistics
by(stroke[,3:10], stroke$Group, describe)
## stroke$Group: A
##       vars n  mean    sd median trimmed   mad min max range  skew kurtosis   se
## week1    1 8 40.00 26.59   30.0   40.00 14.83  20 100    80  1.30     0.34 9.40
## week2    2 8 41.88 25.63   35.0   41.88 14.83  20 100    80  1.31     0.42 9.06
## week3    3 8 46.25 23.72   40.0   46.25 11.12  25 100    75  1.30     0.42 8.39
## week4    4 8 52.50 22.99   47.5   52.50 11.12  25 100    75  0.90    -0.40 8.13
## week5    5 8 61.88 21.37   57.5   61.88 14.83  30 100    70  0.33    -1.02 7.56
## week6    6 8 66.88 18.89   62.5   66.88 11.12  35 100    65  0.11    -0.76 6.68
## week7    7 8 74.38 21.12   77.5   74.38 14.83  30 100    70 -0.88    -0.24 7.47
## week8    8 8 82.50 16.04   87.5   82.50 11.12  50 100    50 -0.85    -0.61 5.67
## ------------------------------------------------------------ 
## stroke$Group: B
##       vars n  mean    sd median trimmed   mad min max range  skew kurtosis   se
## week1    1 8 36.88 14.87   37.5   36.88 11.12  15  65    50  0.39    -0.74 5.26
## week2    2 8 40.00 15.35   37.5   40.00 11.12  15  65    50  0.08    -1.10 5.43
## week3    3 8 46.25 22.48   40.0   46.25 18.53  10  80    70  0.04    -1.33 7.95
## week4    4 8 51.25 22.64   50.0   51.25 25.95  10  80    70 -0.41    -1.14 8.00
## week5    5 8 56.88 24.78   60.0   56.88 29.65  10  80    70 -0.59    -1.08 8.76
## week6    6 8 61.88 24.19   67.5   61.88 25.95  20  85    65 -0.41    -1.53 8.55
## week7    7 8 63.75 24.31   70.0   63.75 25.95  20  90    70 -0.49    -1.34 8.60
## week8    8 8 64.38 24.27   72.5   64.38 22.24  20  90    70 -0.56    -1.27 8.58
## ------------------------------------------------------------ 
## stroke$Group: C
##       vars n  mean    sd median trimmed   mad min max range skew kurtosis   se
## week1    1 8 30.62 11.16   30.0   30.62 11.12  15  45    30 0.07    -1.71 3.95
## week2    2 8 38.75 17.06   35.0   38.75 11.12  20  65    45 0.66    -1.35 6.03
## week3    3 8 41.25 16.42   35.0   41.25  7.41  30  70    40 0.91    -1.19 5.81
## week4    4 8 45.62 21.12   37.5   45.62  7.41  30  90    60 1.12    -0.38 7.47
## week5    5 8 48.75 22.95   40.0   48.75  7.41  30  90    60 0.88    -1.14 8.11
## week6    6 8 51.25 24.89   40.0   51.25 14.83  30  95    65 0.80    -1.24 8.80
## week7    7 8 55.62 26.38   42.5   55.62 14.83  30  95    65 0.60    -1.57 9.33
## week8    8 8 57.50 28.03   42.5   57.50 11.12  30 100    70 0.65    -1.50 9.91
# correlations of Barthel index over time
round(cor(stroke[,3:10],use="pairwise.complete.obs"),digits=2)
##       week1 week2 week3 week4 week5 week6 week7 week8
## week1  1.00  0.93  0.88  0.83  0.79  0.71  0.62  0.55
## week2  0.93  1.00  0.92  0.88  0.85  0.79  0.70  0.64
## week3  0.88  0.92  1.00  0.95  0.91  0.85  0.77  0.70
## week4  0.83  0.88  0.95  1.00  0.92  0.88  0.83  0.77
## week5  0.79  0.85  0.91  0.92  1.00  0.97  0.91  0.88
## week6  0.71  0.79  0.85  0.88  0.97  1.00  0.96  0.93
## week7  0.62  0.70  0.77  0.83  0.91  0.96  1.00  0.98
## week8  0.55  0.64  0.70  0.77  0.88  0.93  0.98  1.00
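The correlations fall off steadily as the lag between weeks grows, which is the pattern an AR(1) structure would produce. As a quick check, one could average the correlations at each lag; `lag_cor` below is a hypothetical helper (not part of any package), illustrated here on an exact AR(1) matrix:

```r
# Average correlation at each time lag of a correlation matrix
lag_cor <- function(R) {
  k <- nrow(R)
  sapply(1:(k - 1), function(l) mean(R[row(R) == col(R) + l]))
}
# Illustration on an exact AR(1) matrix with rho = 0.9:
R <- 0.9^abs(outer(1:4, 1:4, "-"))
round(lag_cor(R), 3)  # 0.900 0.810 0.729
```

Applied to `cor(stroke[,3:10], use="pairwise.complete.obs")`, this gives one average correlation per lag; a roughly geometric decay supports an AR(1)-type candidate model.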
  1. Make a spaghetti plot of the data (don’t forget to restructure the data!).
# Reshape data (to "long" format) for mixed models analysis
stroke.long <- reshape(stroke, idvar="Subject", varying=list(3:10), direction="long")
names(stroke.long)
## [1] "Subject" "Group"   "time"    "week1"
colnames(stroke.long) <- c("Subject", "Group", "week", "Barthel")
summary(stroke.long)
##     Subject         Group                week         Barthel      
##  Min.   : 1.00   Length:192         Min.   :1.00   Min.   : 10.00  
##  1st Qu.: 6.75   Class :character   1st Qu.:2.75   1st Qu.: 35.00  
##  Median :12.50   Mode  :character   Median :4.50   Median : 45.00  
##  Mean   :12.50                      Mean   :4.50   Mean   : 52.37  
##  3rd Qu.:18.25                      3rd Qu.:6.25   3rd Qu.: 70.00  
##  Max.   :24.00                      Max.   :8.00   Max.   :100.00
# Spaghetti plots
p <- ggplot(data = stroke.long, aes(x = week, y = Barthel, group = Subject))
p + geom_line() + facet_grid(. ~ Group)
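The individual lines are noisy; to see the group-level trend through the noise, one could overlay the per-week group means (a sketch, reusing `p` and `stroke.long` from above; the layered call is an assumption, not part of the original analysis):

```r
# Overlay per-week group mean trajectories (red) on the individual lines
p + geom_line(alpha = 0.4) +
  stat_summary(aes(group = Group), fun = mean, geom = "line",
               colour = "red", size = 1) +
  facet_grid(. ~ Group)
```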

  1. Fit the model you think would best describe the patterns in the data.

From both the descriptive statistics and the graphs, patients on treatment A appear to have higher average scores; however, these patients also started higher. The standard deviations of the Barthel index increase over time for groups B and C, but decrease over time for group A. The graph suggests why: one person in group A scores 100 at every time point. The spaghetti plot indicates that a linear mixed-effects model would be reasonable, while the correlations suggest either an LME with random intercept + slope or a CPM with heterogeneous AR(1) correlation.

Because modelling time as a factor uses considerably more degrees of freedom, I would choose the LME with time as continuous. In a randomized trial, my interest is always in differing trends over time between the groups, i.e. in a possible treatment × time interaction. Therefore: fixed effects for time, group and time × group, and a random intercept and time effect per subject. Note that I model all measurements as outcomes here; you may get slightly different results if you choose to control for baseline instead.

# Make week into a factor variable, time is a copy of week but continuous
stroke.long$time <- stroke.long$week
stroke.long$week <- as.factor(stroke.long$week)

# LME with random intercept + random slope, fixed effects: tx, time, tx*time
str.lme.ris <- lme(fixed=Barthel ~ time*Group, random=~time|Subject, data=stroke.long, method="ML")
summary(str.lme.ris)
## Linear mixed-effects model fit by maximum likelihood
##  Data: stroke.long 
##        AIC      BIC   logLik
##   1368.448 1401.023 -674.224
## 
## Random effects:
##  Formula: ~time | Subject
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 19.622859 (Intr)
## time         2.744509 -0.368
## Residual     5.181279       
## 
## Fixed effects: Barthel ~ time * Group 
##                 Value Std.Error  DF   t-value p-value
## (Intercept) 29.821429  7.196377 165  4.143951  0.0001
## time         6.324405  1.026834 165  6.159128  0.0000
## GroupB       3.348214 10.177213  21  0.328991  0.7454
## GroupC      -0.022321 10.177213  21 -0.002193  0.9983
## time:GroupB -1.994048  1.452163 165 -1.373157  0.1716
## time:GroupC -2.686012  1.452163 165 -1.849663  0.0662
##  Correlation: 
##             (Intr) time   GroupB GroupC tm:GrB
## time        -0.396                            
## GroupB      -0.707  0.280                     
## GroupC      -0.707  0.280  0.500              
## time:GroupB  0.280 -0.707 -0.396 -0.198       
## time:GroupC  0.280 -0.707 -0.198 -0.396  0.500
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -2.896236549 -0.448975316  0.004636154  0.462924261  3.170944869 
## 
## Number of Observations: 192
## Number of Groups: 24
getVarCov(str.lme.ris, type="marginal")
## Subject 1 
## Marginal variance covariance matrix
##        1      2      3      4      5      6      7      8
## 1 379.79 340.65 328.36 316.07 303.78 291.49 279.19 266.90
## 2 340.65 362.74 331.13 326.37 321.62 316.86 312.10 307.34
## 3 328.36 331.13 360.75 336.68 339.45 342.23 345.00 347.77
## 4 316.07 326.37 336.68 373.83 357.29 367.60 377.90 388.21
## 5 303.78 321.62 339.45 357.29 401.98 392.97 410.81 428.64
## 6 291.49 316.86 342.23 367.60 392.97 445.18 443.71 469.08
## 7 279.19 312.10 345.00 377.90 410.81 443.71 503.46 509.51
## 8 266.90 307.34 347.77 388.21 428.64 469.08 509.51 576.80
##   Standard Deviations: 19.488 19.046 18.993 19.335 20.049 21.099 22.438 24.017
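Each entry of this marginal matrix can be reconstructed from the estimated variance components: with random-effects design (1, t), the model implies Cov(y_s, y_t) = d11 + (s + t)·d12 + s·t·d22 + σ²·1{s = t}. A quick check against the printed entries (using the rounded estimates from the summary above, so the match is approximate):

```r
# Reconstruct the marginal (co)variances from the estimated components:
# Cov(y_s, y_t) = d11 + (s + t)*d12 + s*t*d22 + sig2 * (s == t)
d11  <- 19.622859^2                      # random-intercept variance
d22  <- 2.744509^2                       # random-slope variance
d12  <- -0.368 * 19.622859 * 2.744509    # intercept-slope covariance
sig2 <- 5.181279^2                       # residual variance
marg <- function(s, t) d11 + (s + t) * d12 + s * t * d22 + sig2 * (s == t)
round(marg(1, 1), 1)  # ~379.8, the [1,1] entry above
round(marg(1, 2), 1)  # ~340.7, the [1,2] entry above
```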

Though groups B and C have lower slopes than group A (negative interactions with time), the time × group interaction is not statistically significant. To plot the model-based predictions, we first generate them on a grid, then graph them (I use ggplot2, but you could also use xyplot).

# Plot predicted values on basis of the model:
# Again, first get predictions, then plot them (here I use ggplot2)
pdata <- expand.grid(time=1:8, Group=c("A", "B", "C"))
pdata$pBarthel <- predict(object=str.lme.ris, newdata=pdata, level=0)
p2 <- ggplot(data = pdata, aes(x = time, y = pBarthel, group = Group))
p2 + geom_point(aes(colour=Group)) + geom_line()
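The descriptives also pointed to the CPM with heterogeneous AR(1) correlation as an alternative. A sketch of that fit with `gls()`, for comparison (assuming `stroke.long` with factor `week` and continuous `time`, and `str.lme.ris` as above; not part of the original analysis):

```r
library(nlme)
# CPM alternative: AR(1) correlation over time within subject,
# with a separate residual variance per week (heterogeneous AR(1))
str.gls.har1 <- gls(Barthel ~ time * Group, data = stroke.long,
                    correlation = corAR1(form = ~ time | Subject),
                    weights = varIdent(form = ~ 1 | week),
                    method = "ML")
AIC(str.lme.ris, str.gls.har1)  # compare with the random intercept + slope LME
```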

  1. Summarize and interpret the results in part (e).

The average Barthel index at time 0 (extrapolated; measurements start at week 1) is 29.8 in group A, the reference group; the average for B was 3.3 points higher and for C 0.02 points lower, though these differences are not statistically significant. According to this model, group A improves by about 6.3 points per week, group B by 6.3 − 2.0 = 4.3 points per week, and group C by 6.3 − 2.7 = 3.6 points per week. There is some indication of a time × group interaction, but insufficient evidence (p for the joint test of the interaction terms is 0.1613); it would therefore be more appropriate to report one time trend (slope) for all groups, by re-fitting the model without the interaction to get the common average increase per week.

The variation in random intercepts is large compared to the residual variance: there is clearly substantial between-patient variation, both in starting level and in slope over time, which is why 8 patients per group do not provide enough power to detect the observed difference in trends between the 3 groups as statistically significant.
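A sketch of that re-fit without the interaction (assuming `stroke.long` and `str.lme.ris` as above; the object name `str.lme.noint` is my own):

```r
library(nlme)
# Re-fit with a common slope for all groups
str.lme.noint <- lme(fixed = Barthel ~ time + Group,
                     random = ~ time | Subject,
                     data = stroke.long, method = "ML")
anova(str.lme.noint, str.lme.ris)  # likelihood-ratio test of time:Group
fixef(str.lme.noint)["time"]       # average increase in Barthel per week
```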

Session Info

devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value                       
##  version  R version 4.0.3 (2020-10-10)
##  os       macOS Catalina 10.15.7      
##  system   x86_64, darwin17.0          
##  ui       X11                         
##  language (EN)                        
##  collate  nl_NL.UTF-8                 
##  ctype    nl_NL.UTF-8                 
##  tz       Europe/Amsterdam            
##  date     2020-11-25                  
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date       lib source        
##  abind         1.4-5   2016-07-21 [1] CRAN (R 4.0.2)
##  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.0.2)
##  backports     1.2.0   2020-11-02 [1] CRAN (R 4.0.2)
##  broom         0.7.2   2020-10-20 [1] CRAN (R 4.0.2)
##  callr         3.5.1   2020-10-13 [1] CRAN (R 4.0.2)
##  car         * 3.0-10  2020-09-29 [1] CRAN (R 4.0.2)
##  carData     * 3.0-4   2020-05-22 [1] CRAN (R 4.0.2)
##  cellranger    1.1.0   2016-07-27 [1] CRAN (R 4.0.2)
##  cli           2.2.0   2020-11-20 [1] CRAN (R 4.0.2)
##  colorspace    2.0-0   2020-11-11 [1] CRAN (R 4.0.2)
##  cowplot       1.1.0   2020-09-08 [1] CRAN (R 4.0.2)
##  crayon        1.3.4   2017-09-16 [1] CRAN (R 4.0.2)
##  curl          4.3     2019-12-02 [1] CRAN (R 4.0.1)
##  data.table    1.13.2  2020-10-19 [1] CRAN (R 4.0.2)
##  desc          1.2.0   2018-05-01 [1] CRAN (R 4.0.2)
##  devtools      2.3.2   2020-09-18 [1] CRAN (R 4.0.2)
##  digest        0.6.27  2020-10-24 [1] CRAN (R 4.0.2)
##  dplyr         1.0.2   2020-08-18 [1] CRAN (R 4.0.2)
##  ellipsis      0.3.1   2020-05-15 [1] CRAN (R 4.0.2)
##  evaluate      0.14    2019-05-28 [1] CRAN (R 4.0.1)
##  fansi         0.4.1   2020-01-08 [1] CRAN (R 4.0.2)
##  farver        2.0.3   2020-01-16 [1] CRAN (R 4.0.2)
##  forcats       0.5.0   2020-03-01 [1] CRAN (R 4.0.2)
##  foreign     * 0.8-80  2020-05-24 [2] CRAN (R 4.0.3)
##  fs            1.5.0   2020-07-31 [1] CRAN (R 4.0.2)
##  gdata       * 2.18.0  2017-06-06 [1] CRAN (R 4.0.2)
##  generics      0.1.0   2020-10-31 [1] CRAN (R 4.0.2)
##  ggplot2     * 3.3.2   2020-06-19 [1] CRAN (R 4.0.2)
##  ggpubr        0.4.0   2020-06-27 [1] CRAN (R 4.0.2)
##  ggsignif      0.6.0   2019-08-08 [1] CRAN (R 4.0.2)
##  glue          1.4.2   2020-08-27 [1] CRAN (R 4.0.2)
##  gtable        0.3.0   2019-03-25 [1] CRAN (R 4.0.2)
##  gtools        3.8.2   2020-03-31 [1] CRAN (R 4.0.2)
##  haven         2.3.1   2020-06-01 [1] CRAN (R 4.0.2)
##  hms           0.5.3   2020-01-08 [1] CRAN (R 4.0.2)
##  htmltools     0.5.0   2020-06-16 [1] CRAN (R 4.0.2)
##  knitr         1.30    2020-09-22 [1] CRAN (R 4.0.2)
##  labeling      0.4.2   2020-10-20 [1] CRAN (R 4.0.2)
##  lattice       0.20-41 2020-04-02 [2] CRAN (R 4.0.3)
##  lifecycle     0.2.0   2020-03-06 [1] CRAN (R 4.0.2)
##  magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.0.3)
##  Matrix        1.2-18  2019-11-27 [2] CRAN (R 4.0.3)
##  memoise       1.1.0   2017-04-21 [1] CRAN (R 4.0.2)
##  mgcv          1.8-33  2020-08-27 [2] CRAN (R 4.0.3)
##  mnormt        2.0.2   2020-09-01 [1] CRAN (R 4.0.2)
##  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.0.2)
##  nlme        * 3.1-149 2020-08-23 [2] CRAN (R 4.0.3)
##  openxlsx      4.2.3   2020-10-27 [1] CRAN (R 4.0.2)
##  pillar        1.4.7   2020-11-20 [1] CRAN (R 4.0.2)
##  pkgbuild      1.1.0   2020-07-13 [1] CRAN (R 4.0.2)
##  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.0.2)
##  pkgload       1.1.0   2020-05-29 [1] CRAN (R 4.0.2)
##  prettyunits   1.1.1   2020-01-24 [1] CRAN (R 4.0.2)
##  processx      3.4.4   2020-09-03 [1] CRAN (R 4.0.2)
##  ps            1.4.0   2020-10-07 [1] CRAN (R 4.0.2)
##  psych       * 2.0.9   2020-10-05 [1] CRAN (R 4.0.2)
##  purrr         0.3.4   2020-04-17 [1] CRAN (R 4.0.2)
##  R6            2.5.0   2020-10-28 [1] CRAN (R 4.0.2)
##  Rcpp          1.0.5   2020-07-06 [1] CRAN (R 4.0.2)
##  readxl        1.3.1   2019-03-13 [1] CRAN (R 4.0.2)
##  remotes       2.2.0   2020-07-21 [1] CRAN (R 4.0.2)
##  rio           0.5.16  2018-11-26 [1] CRAN (R 4.0.2)
##  rlang         0.4.8   2020-10-08 [1] CRAN (R 4.0.2)
##  rmarkdown     2.5     2020-10-21 [1] CRAN (R 4.0.3)
##  rprojroot     2.0.2   2020-11-15 [1] CRAN (R 4.0.2)
##  rstatix       0.6.0   2020-06-18 [1] CRAN (R 4.0.2)
##  scales        1.1.1   2020-05-11 [1] CRAN (R 4.0.2)
##  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.0.2)
##  stringi       1.5.3   2020-09-09 [1] CRAN (R 4.0.2)
##  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.0.2)
##  testthat      3.0.0   2020-10-31 [1] CRAN (R 4.0.2)
##  tibble        3.0.4   2020-10-12 [1] CRAN (R 4.0.2)
##  tidyr         1.1.2   2020-08-27 [1] CRAN (R 4.0.2)
##  tidyselect    1.1.0   2020-05-11 [1] CRAN (R 4.0.2)
##  tmvnsim       1.0-2   2016-12-15 [1] CRAN (R 4.0.2)
##  usethis       1.6.3   2020-09-17 [1] CRAN (R 4.0.2)
##  vctrs         0.3.5   2020-11-17 [1] CRAN (R 4.0.2)
##  withr         2.3.0   2020-09-22 [1] CRAN (R 4.0.2)
##  xfun          0.19    2020-10-30 [1] CRAN (R 4.0.2)
##  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.0.2)
##  zip           2.1.1   2020-08-27 [1] CRAN (R 4.0.2)
## 
## [1] /Users/dveen5/Library/R/4.0/library
## [2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library